Thursday, February 13, 2025

SearchResearch Commentary (2/13/25): Using NotebookLM to help with Deep Research

 Let me tell you a short story... 

A red-stained schistosoma worm. The male is the small one,
spending his entire life within the groove of the female.  P/C CDC.

As you know, I’m writing another book–this one is about Unanticipated Consequences of the choices we made.  (Here’s my substack that I’m using to try out various sections.) 

And, as you also know, I have a deep interest in Egypt.  So, on a recent trip there, I visited the Aswan High Dam (which we discussed earlier in SRS in conjunction with its creation of Lake Nasser) and thought about what some of the Unanticipated Consequences of building that massive dam might be.  This led to our SRS Research Challenge about “What has been the effect of the creation of Lake Nasser on the incidence of schistosomiasis in Egypt?”  


A little background: Schistosomiasis (aka snail fever, bilharziasis) is a disease caused by several species of Schistosoma flatworms, chiefly Schistosoma haematobium, S. japonicum, and S. mansoni. Transmission can occur when humans are exposed to freshwater sources that are contaminated with Schistosoma parasites.


People infected with Schistosoma flatworms shed eggs in their urine or stool. In fresh water, immature flatworms called miracidia hatch from eggs. The miracidia find and infect the tissues of freshwater snails, where they mature into an infective stage known as cercariae. The free-swimming cercariae are released from the infected snail into the surrounding water. Cercariae enter their human host by burrowing through skin that is exposed to contaminated water. Male and female parasites migrate to the liver, the lower intestines, or the bladder, where they release eggs into feces or urine.

Symptoms of schistosomiasis are a result of the body’s immune response to the flatworms and their eggs. Symptoms include itching at the cercariae entry sites, fever, cough, abdominal pain, diarrhea, and enlargement of the liver and spleen. In some cases, infection may lead to lesions in the central nervous system.



In that previous post we did a little comparison of the different Deep Research tools out there.  Or at least, we looked at how Deep Research tools might work.  I promised to look at some other analysis tools.  This is my briefing on using Google’s NotebookLM.  

Full disclosure: Some of my Googler friends currently work on NotebookLM. I've tried to not let that influence my writing here.


How NotebookLM works:  The core idea of NotebookLM (NLM) is that you create a notebook on a given topic, upload a bunch of sources into it, and then “have a conversation” with your sources.  


Think of it as though you’ve got a research assistant–you give them a pile of content (Google Docs, Slides, PDFs, raw text notes, web pages, YouTube videos, and audio files).  Then you can chat with them about the stuff that’s in there.  It’s as though NLM has just read everything and understands it reasonably well.  This is handy if you’ve got a lot of content to use.  (In the AI biz this is called “Retrieval Augmented Generation,” which means that the text generation process is driven primarily by the contents of the sources you’ve uploaded.)  
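If you're curious what that "chat with your sources" pattern looks like under the hood, here is a minimal sketch of the RAG idea in Python. To be clear, this is not NotebookLM's actual code: the retrieval here is a toy bag-of-words match and call_llm() is a hypothetical stand-in for a real model. But the overall shape (retrieve the most relevant sources, then generate an answer grounded in them) is the general idea.

```python
# A minimal sketch of Retrieval Augmented Generation (RAG), the idea behind
# NotebookLM's "chat with your sources" design. This is NOT NotebookLM's code;
# the retrieval is a toy bag-of-words match, and call_llm() is a hypothetical
# stand-in for whatever language model you actually use.
import math
from collections import Counter

def vectorize(text: str) -> Counter:
    """Toy bag-of-words vector; a real system would use learned embeddings."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(question: str, sources: dict[str, str], k: int = 3) -> list[str]:
    """Rank the uploaded sources by similarity to the question; keep the top k."""
    q = vectorize(question)
    ranked = sorted(sources, key=lambda name: cosine(q, vectorize(sources[name])), reverse=True)
    return ranked[:k]

def call_llm(prompt: str) -> str:
    """Hypothetical placeholder for a real LLM call."""
    return f"[LLM answer grounded in the prompt below]\n{prompt[:200]}..."

def answer(question: str, sources: dict[str, str]) -> str:
    """Retrieve relevant sources, then generate an answer grounded in them."""
    top = retrieve(question, sources)
    context = "\n\n".join(f"Source: {name}\n{sources[name]}" for name in top)
    prompt = f"Answer using ONLY these sources, and cite them:\n{context}\n\nQuestion: {question}"
    return call_llm(prompt)

if __name__ == "__main__":
    papers = {
        "Effects of the Aswan Dam.pdf": "Schistosomiasis prevalence changed after the dam was built...",
        "Snail ecology.pdf": "Bulinus snails are the intermediate host for the parasite...",
    }
    print(answer("Did schistosomiasis increase after the Aswan Dam was built?", papers))
```

The key point of the pattern is the prompt construction in answer(): the model is asked to respond only from the retrieved text, which is why NLM's answers stay grounded in (and citable to) the sources you uploaded.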


So… by using our standard SRS methods, I searched for good online content (mostly PDFs of medical articles, mostly found by using Google Scholar) and uploaded it to NLM.  


I’ll spare you the long story here… but I spent WAY too much time on content that initially looked good, but turned out to be puff pieces published in low-quality journals.  (I assume that you, our SRS regulars, know how to spot this!) 


I ended up with a screen that looked like this (the red rounded-rectangles and circled numbers are mine):


In panel 1, I’ve uploaded 9 different sources–one of which was my SRS post from last month about Egypt.  I picked the other 8 papers as being high-quality studies of the effects of building the Aswan Dam on the incidence of schistosomiasis in Egypt.  

Once I loaded them up, I chatted with NLM about the papers a bit in panel 2 and then asked my key question:  

    [ did the incidence of schistosomiasis increase or decrease after the Aswan Dam was built? ] 

And then: 

    [ What were the predicted effects of the Aswan High Dam on schistosomiasis? ] 

NLM then gave me a quite reasonable answer based on the documents I’d uploaded as sources.  As you can see in the image below, when you roll over a gray citation number, NLM shows you the place in the sources from which it extracted the supporting text.  


Rolling over one of the citation numbers causes a pop-up that shows exactly where the
supporting text is in the list of sources.  

Now, I have to admit that I had read these 9 papers, so I have a good idea what’s in them.  And I’d written a draft of the answer to that question.  Somewhat to my surprise, the answer was quite good–it’s very much what I would have written (although perhaps not in the style that you see).  Here’s the start of the NLM reply to my question (and a link to the full doc): 


After the Aswan High Dam (AHD) was built, the incidence of schistosomiasis in Egypt decreased overall, despite initial concerns that the dam would cause it to increase [1-4].

  • Initial Concerns: It was predicted that the impoundment of the Nile would create ideal conditions for snails, which are intermediate hosts for the schistosomiasis parasite [5-7]. It was also thought that steady water flow downstream from the dam would increase human contact with the water, leading to more contamination [5, 6].


  • Public Health Measures: The implementation of public health programs played a critical role in reducing schistosomiasis [4, 8-10]. These programs included:

  • Installation of clean fresh water pumps [11, 12].

  • Treatment of canals and drainage ditches with copper sulfate [11, 12].

  • Mass distribution of the drugs metrifonate and praziquantel [11-14].

  • Mollusciciding to reduce snail populations [15].


… etc … 


This is pretty good, but I have a couple of quibbles. 


The citations are both wonderful and annoying.  The wonderful part is that each citation is very specific.  As you see in the above image, NLM tells you the paper it found AND the specific supporting text within it.  That’s really nice. 


On the other hand, you have to go through the note citation-by-citation, one at a time, to figure out that citations 1, 3, 5, 6, and 8 all point to the paper “Effects of the Aswan Dam.pdf” in the sources.  And, at this time, I haven’t been able to figure out how to get NLM to give me a full citation list–you know, something that looks like this: 


Abd-El Monsef, H., Smith, S. E., & Darwish, K. (2015). Impacts of the Aswan high dam after 50 years. Water Resources Management, 29, 1873-1885.


I hope that gets added to the list of things NLM will do, but at the moment, it’s kind of a laborious process to figure out the mapping from citation numbers to actual papers. 


But perhaps the most impressive piece of this analysis by NLM was the summary (emphasis by NLM): 


In summary, while the Aswan High Dam did cause a change in the types of schistosomiasis seen in Egypt, and there were initial fears the disease would increase, overall schistosomiasis decreased significantly because of public health interventions [9]


That’s the conclusion I came to as well, and it’s definitely NOT the first thing you’d find with just a simple Google search.  In fact, before the dam was built, an increase in schistosomiasis was predicted by epidemiologist Van Der Schalie, who wrote: “...there is evidence that the high incidence of the human blood fluke schistosomiasis in the area may well cancel out the benefits the construction of the Aswan High Dam may yield” (Farvar and Milton 1972). It was widely thought that schistosomiasis would increase as farmers converted from basin irrigation to perennial irrigation and so had more water to irrigate with.

But with such predictions widely known, the Egyptian government began a far-ranging program of killing snails, cleaning up waterways, and giving much of the population chemotherapy against schistosomiasis.  


In 1978, soon after the AHD was commissioned, a baseline epidemiological survey of over 15,000 rural Egyptians from three geographical regions of Egypt (Nile Delta, Middle Egypt, and Upper Egypt), plus the resettled Nubian population, showed that the prevalence of schistosomiasis was 42% in the North Central Delta region, 27% in Middle Egypt, and 25% in Upper Egypt.  That sounds bad, but it’s a massive reduction in disease rates.

So, overall, the control program worked pretty well, with infection rates dropping to record lows.  

The thing is, the predictions about the consequences of building the dam were right–but Egypt counteracted the expected increase by anticipating those consequences and being proactive about addressing them. 


Wait!  There’s more: 

In addition to this chat / query process, you can also ask NLM to create a study guide, an overall summary, or a briefing doc.  NLM gives you prompts to use (roughly “Create a study guide from these sources…” or “Summarize these sources in a readable format.”)  

As an expert SearchResearcher, you can also chat with just one document to get JUST that document’s perspective (and, implicitly, that author’s point-of-view).  

Give this a try!


SearchResearch NotebookLM tips:

  1.  Make sure the content you upload as sources is high quality. If you upload low-quality content, that will surface in the answers NLM gives you.

  2. If you lose the connection between a source you uploaded to NLM and its original citation, it’s a pain to restore; you have to remember where it came from.  Be sure the uploaded source has the identifying info in it.  (I always put the citation and the original source URL at the top of the source text; see the sketch just after this list.) 

  3. If you ask a question about something that’s NOT covered by the sources, the quality will drop.  Stay on topic and try not to overdrive your headlights.   

  4. I haven't tried using NLM without carefully selecting the sources that go into the collection, but I can imagine a use-case where you add multiple sources in and then ask for the plus-and-minus analysis. That's an interesting experiment--let us know if you try it.
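
For tip 2, here's the kind of tiny helper I have in mind. It's a hypothetical sketch (the function and file names are mine, not an official NotebookLM tool): it stamps a plain-text source with its citation and original URL before you upload it, so the provenance travels with the file.

```python
# Hypothetical helper for tip 2: stamp a plain-text source with its citation
# and original URL before uploading it to NotebookLM, so you can always trace
# the uploaded copy back to where it came from.
from pathlib import Path

def stamp_source(path: str, citation: str, url: str) -> Path:
    """Write a new copy of the file with a provenance header at the top."""
    original = Path(path)
    stamped = original.with_name(original.stem + "_stamped" + original.suffix)
    header = f"CITATION: {citation}\nSOURCE URL: {url}\n\n"
    stamped.write_text(header + original.read_text(encoding="utf-8"), encoding="utf-8")
    return stamped

# Example (file name, citation, and URL are illustrative placeholders):
# stamp_source(
#     "aswan_dam_effects.txt",
#     "Abd-El Monsef, H., Smith, S. E., & Darwish, K. (2015). Impacts of the Aswan high dam after 50 years.",
#     "https://example.com/aswan-paper",
# )
```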





Keep searching. 

Thursday, February 6, 2025

SearchResearch (2/6/2025): SearchResearch, Search, and Deep Research

I wasn't surprised... 

"Deep Research" as imagined by Google Gemini.
I especially like the researcher writing simultaneously with both hands. 
I can't do that, can you? 

... when "Deep Research" suddenly became the thing all the cool AI kids were talking about.   It's the inevitable next step in "Search" when you live in a world full of LLMs doing their thing.  

What is Deep Research?  

You know how regular web search works. The whole premise of SearchResearch has been to figure out how to do deeper, more insightful research by using the search tools at hand.  

"Deep Research" uses an LLM to do some of that work for us.  Is it the next generation of SRS tools?  

In effect, "Deep Research" is AI-guided search + LLM analysis to make multi-step investigations. The goal is to do some of the SearchResearch heavy lifting for you. 

The key new idea of "Deep Research" is that the LLM uses its "reasoning model" to create a multi-step plan that it then executes to find, consolidate, and analyze information for you.  This actually works pretty well because the DR system breaks the large research task down into multiple steps, does a bunch of searches to find relevant documents, then answers the question in the context of those retrieved documents.  It's fairly clever.  
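
To make that plan-then-execute loop concrete, here's a rough sketch of how a DR-style pipeline might be organized. All the function names (plan_steps, web_search, synthesize) are hypothetical placeholders rather than any vendor's actual API; the point is the shape of the loop: plan, search, read, and only then answer.

```python
# A rough sketch of a "Deep Research"-style pipeline: plan the research as
# steps, run searches for each step, collect the retrieved documents, and then
# answer the question grounded in what was found. All function names here are
# hypothetical placeholders, not any vendor's actual API.
from dataclasses import dataclass, field

@dataclass
class Document:
    title: str
    text: str

@dataclass
class ResearchReport:
    question: str
    steps: list[str]
    documents: list[Document] = field(default_factory=list)
    answer: str = ""

def plan_steps(question: str) -> list[str]:
    """Placeholder: an LLM 'reasoning model' would break the question into sub-questions."""
    return [f"Find background on: {question}", f"Find evidence for and against: {question}"]

def web_search(query: str, k: int = 3) -> list[Document]:
    """Placeholder: a real system would call a search API and fetch the pages."""
    return [Document(title=f"Result for '{query}'", text="...")]

def synthesize(question: str, documents: list[Document]) -> str:
    """Placeholder: a real system would prompt an LLM with the retrieved documents."""
    titles = "; ".join(d.title for d in documents)
    return f"Answer to '{question}', grounded in: {titles}"

def deep_research(question: str) -> ResearchReport:
    report = ResearchReport(question=question, steps=plan_steps(question))
    for step in report.steps:                 # execute the plan, one step at a time
        report.documents.extend(web_search(step))
    report.answer = synthesize(question, report.documents)
    return report

if __name__ == "__main__":
    print(deep_research("Did schistosomiasis increase after the Aswan Dam was built?").answer)
```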

You'll probably read that this is one of the first uses of "Agent AI" (aka "agentic"), but it's kinda not really that.  I think most people agree that an AI agent is when an AI system uses another system to accomplish a task.  For instance, an AI agent could log into your favorite airline reservations system and buy a flight on your behalf.  

DR "agents" run queries for you and do analyses of texts that they pull off the web.  At the moment, no DR agents can log you into a paywalled system and purchase an article for you.  THAT would be agentic, and (here's a prediction for you) it'll happen.  But what we see today are very simple pulls from resources, not full agency.  

Henk van Ess has done a nice piece of work contrasting 4 different "Deep Research" systems.  To wit: 

a. Gemini Advanced 1.5 Pro with Deep Research

b. OpenAI’s Deep Research (via ChatGPT Pro)

c. Perplexity’s Deep Research Mode 

d. You.com’s Research Feature 

As you might expect in these days of hurry-up and copy-what-everyone-is-doing, there will be more Deep Research systems in the next few weeks.  (HuggingFace just announced that they made a DR system that "cloned OpenAI's Deep Research system" in just 24 hours. Reading their blurb on how they did this is interesting. Even better: here's their Github repo if you want to play with it yourself.)  

I agree with much of Henk's analysis, but here's my take on the current crop of DR systems.  

There are 3 big issues with DR systems from my perspective.  

1. Quite often the text of the analysis includes all kinds of tangential / off-topic stuff.  If you just blast through the text of the analysis, you'll miss that it's full of things that are true, but not really relevant.  The writing looks good, but is fundamentally hollow because it doesn't understand what it's writing about.  

2. If you ask a silly research question, you'll get a silly answer.  As Henk points out in his analysis, asking a DR system about the "banana peel stock market theory" will generate plausible sounding, syntactically correct, but ultimately meaningless garbage.  

Interesting: if you do an ordinary Gemini search with his question: 

 [Is there a correlation between banana peel speckle distribution and stock market fluctuations in countries that don’t grow bananas?]

you'll get a succinct "There is no known correlation between banana peel speckle distribution and stock market fluctuations in countries that don't grow bananas."

But if you ask Gemini 1.5 Pro with Deep Research, you'll get a 2,200-word vaguely humorous essay with data tables, a section on fungal diseases, and a list of stock market fluctuations.  Even though the summary of the long-form essay is ultimately "Well, based on the available research, the answer is a resounding 'probably not,'" Gemini DR sure takes its sweet time getting there.  

And Gemini handles this better than other DR systems, which generate profoundly idiotic (but polished!) answers. 

Deep point here: If your question doesn't make sense, neither will the answer. 
 

3. You have to be careful about handing your cognitive work off to an AI bot. It's simple to ask a question, get an answer, and learn absolutely nothing in the process.  

This is a caution we've been saying in SRS since the very beginning. But now it's even more important.  Just because something LOOKS nice doesn't mean that it's correct. And just because you've used the tool to answer a question doesn't mean you actually grok it.    

What's more, if you haven't done the research yourself, you won't understand all of the context of the findings--you'll have none of the subtleties and associations that come with doing the work yourself.  

You do NOT want to be in the position of this guy who was able--with the help of AI--to write a great email, but obviously has no idea how to actually do the thing he proposes to do.  (Link to the Apple Intelligence video.) 


Be careful with the tools you use, amigos.  They can have a profound effect on you, even if you didn't intend for that to happen.


And keep searching--there's so much left to learn.  

  



Wednesday, January 29, 2025

Answer: What building is this?

There's an important cautionary tale this week.  AIs sometimes get it wrong... especially with image search...


A building in downtown Denver. P/C Dan Russell 


Last week we asked a relatively simple question:  What building is this?  (See above.) 

If you do a regular old Google Image Search, you get the right answer:  This is the El Jebel Shrine, aka the Sherman Event Center in Denver, Colorado. That’s great, and exactly what you’d expect.  Easy peasy.







If you search on Bing Image Search, you also get the right answer: 




But I wouldn’t write an SRS post about something so obvious. 


What IS surprising is what happens when you ask your favorite AI / LLMs about this image.  


Mostly, they get it very, very wrong.


Oddly, after getting it right with Bing Image Search, Microsoft's Copilot gives a terrible answer.  (Once again, the left hand does not know what the right hand is doing.)  




 

This is NOT the Denver Athletic Club.  (It is also made of brick and has arches, but no domes or minaret-like towers.)  


But if you use the visual description ability of ChatGPT, you get another very wrong answer (mostly because it’s trying to find buildings near my Palo Alto, California location–an assumption that seems really bad… especially since none of the buildings it suggests look anything like the image I asked about!) 





I thought that maybe I should give ChatGPT a hint, telling it that the building was in Denver. But that didn't work either. In fact, the answers got worse. The buildings it suggested aren't anywhere near the search target!




Maybe I'm just asking the wrong LLM?


Here's Claude's reply:



Claude gets the style correct, but not a proper identification.


I then asked Google Gemini what the answer might be.  AGAIN:  It too is really wrong: 




I know the Advanced Medicine Center Building in Palo Alto--it looks nothing like this.


Well, I thought, that’s because I’m using Gemini 1.5 Pro.  (As you know, there are multiple Gemini models to choose from...)


But when I switched to Gemini 2.0 Flash Experimental (the latest!), I got an even more wrong answer, even though it has a “might not work as expected” disclaimer.  Indeed!  





The Mosque of Ibn Tulun looks a little like the image with domes and towers, but the color, layout, and materials are all wrong.  


HOWEVER, when I tried Gemini 2.0 Experimental Advanced, I finally got the correct answer. 





Notice that it took 3 different tries with different Gemini models to get to the right answer.  That’s not encouraging.  We know that most people will simply accept the first result and not do any follow-on checking.  


Since I took the photo, I’ll tell you: this really IS the Sherman Street Event Center, aka the El Jebel Shrine, aka the Rocky Mountain Consistory, aka the Scottish Rite Temple, a historic building in the North Capitol Hill neighborhood of downtown Denver. 





Here’s a great article about it: https://denverite.com/2024/05/09/el-jebel-shriner-mosque-photos/ 


The Moorish-inspired building was constructed in 1907 as a meeting hall for the El Jebel chapter of the Ancient Arabic Order of the Nobles of the Mystic Shrine (the Shriners). It was never a true mosque in the Islamic sense. In 1924, having outgrown the building, the Shriners sold it to the Scottish Rite Masons, who renamed it. In 1995, the Scottish Rite sold the building to Eulipions, Inc., which converted it into a catering and events facility, and it’s been bouncing around the Denver real estate market ever since.  (Although it seems to have recently landed a permanent owner.) 


This is a beautiful example of the Moorish Revival Architecture movement of the early 20th century in the US. The style is a variation on the Islamic architecture of Moorish Spain, characterized by geometric shapes, arches, and decorative elements like arabesques and ceramic tiles.


Bottom line: The AIs are mostly wrong.  Regular search-by-image is much better.  (Oddly, Tineye.com found nothing, not even a near miss!)  


But what’s worse is that they’re CONFIDENTLY wrong.  There’s no hesitation, no questioning of the plausibility of the results.  When Microsoft CoPilot says that this is the “Denver Athletic Club,” it doesn’t say that it “might be” the club, or that “this image looks a great deal like the Athletic Club, but I’m not 100% sure.”  


Which is disappointing.  


Search Research Lessons


1. Don't trust any image identification that the LLMs give you. We've talked about this before, but REALLY... if they can't identify a very visually distinct building, I wouldn't trust their results when foraging for mushrooms, berries, or edible plants! Perhaps one day they'll improve their accuracy... but for the time being, be sure to double-check everything!


As always,


Keep Searching!







Thursday, January 23, 2025

SearchResearch Challenge (1/22/25): What building is this?

 This should be a simple question... 

P/C by Dan Russell

... but it turns out that identifying this building is either really simple... or really hard... it all depends on which tools you use!  

This leads to our SRS Challenge today: 

1.  What is the name of this building and where is it?  What style is it?  (For extra credit--why was it made in this rather elaborate style?)  

It shouldn't take you long to find the CORRECT answer, but many of the tools in use today are giving really terrible responses.  Can you separate the wheat from the chaff?  

Let us know what you find, and how you know that it's the right answer!

Keep searching!